Revisiting the Codon Adaptation Index from a Whole-genome Perspective: Gene Expression, Codon Bias, and Metabolic Networks in the Context of Genomes Comparison
Abstract
The facts and ideas presented in this short review concern recent developments at the interface between sequence analysis, gene expression prediction and genome comparison carried out in our group. The guiding thread of all results presented here is to derive biological information from genome sequences by means of purely statistical analysis and appropriately designed algorithms.

1 Some background and motivation

Proteins are built from 20 amino acids, each encoded by triplets of nucleotides called codons. The four nucleotides (A, T, C, G) define the 64 codons used in the cell. Codons are not employed uniformly in the cell; on the contrary, certain codons are preferred, and we speak of codon bias. There are several kinds of codon bias, and some of them are linked to specific biological functions. Statistical analyses of DNA sequences, and in particular of codon bias, have been performed ever since long stretches of DNA sequence became publicly available in the early eighties (Grantham et al. 1980; Wada et al. 1990), and the roots of these studies can be traced back to the sixties (Sueoka 1962; Zuckerkandl and Pauling 1965). With the increasing number of bacterial genome sequences from a broad diversity of species, however, this field of research has been revived in the last few years (Koonin and Galperin 1997; Lin and Gerstein 2000; Radomski and Slonimski 2001; Knight et al. 2001; Sicheritz-Ponten and Andersson 2001; Daubin et al. 2002; Lin et al. 2002; Lobry and Chessel 2003; Sandberg et al. 2003; Jansen et al. 2003). Biased codon usage may result from a diversity of factors: GC-content, preference for codons with G or C at the third nucleotide position (Lafay et al. 1999), a leading strand richer in G+T than the lagging strand (Lafay et al. 1999), horizontal gene transfer, which introduces chromosome segments of unusual base composition (Moszer et al. 1999), and in particular translational bias, which has frequently been observed in fast-growing prokaryotes and eukaryotes (Sharp and Li 1987; Sharp et al. 1986; Medigue et al. 1991; Shields and Sharp 1987; Sharp et al. 1988;
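To make the notions of codon bias and the Codon Adaptation Index concrete, here is a minimal Python sketch. It assumes the usual Sharp and Li (1987) formulation, in which each codon receives a relative adaptiveness w (its count in a reference gene set divided by the count of the most frequent synonymous codon) and the CAI of a gene is the geometric mean of w over its codons. The function names, the toy reference set, and the choice to skip codons absent from the reference are illustrative assumptions, not the method of the review itself.

```python
from collections import Counter
from itertools import product
from math import exp, log

# Standard genetic code: 64 codons in TCAG order mapped to amino acids ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into consecutive triplets (trailing bases dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """w(codon) = count / count of the most used synonymous codon in the reference set."""
    counts = Counter(
        c for gene in reference_genes for c in codons(gene) if c in CODON_TABLE
    )
    best = Counter()  # highest codon count seen for each amino acid
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        best[aa] = max(best[aa], n)
    return {codon: n / best[CODON_TABLE[codon]] for codon, n in counts.items()}


def cai(gene, w):
    """Geometric mean of w over the gene's codons.

    Simplifications (assumptions of this sketch): Met, Trp and stop codons are
    ignored, and codons never seen in the reference set are skipped.
    """
    logs = [
        log(w[c]) for c in codons(gene)
        if c in w and CODON_TABLE[c] not in "MW*"
    ]
    return exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference set: alanine is encoded three times by GCT and once by GCC.
w = relative_adaptiveness(["GCTGCTGCTGCC"])
print(cai("GCTGCT", w))  # gene using only the preferred codon
print(cai("GCTGCC", w))  # gene mixing preferred and rare codons
```

A gene that uses only the preferred codons of the reference set scores 1, and the score decreases as rarer synonymous codons are used. In practice the reference set would be a collection of genes assumed to be highly expressed, which is exactly the assumption that a whole-genome treatment calls into question.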
Similar resources
Codon bias patterns in photosynthetic genes of halophytic grass Aeluropus littoralis
Codon bias refers to differences in the frequency of occurrence of synonymous codons in coding DNA. Patterns of codon usage and of optimal codon utilization differ significantly between organisms. This difference results from the long-term action of natural selection and the evolutionary process. Genetic drift, mutation and regulation of gene expression are the main reasons for codon bias. In this stu...
Identification of Synonymous Codon Usage Bias in the Pseudorabies Virus UL31 Gene
Background: Little is known about the synonymous codon usage pattern of the pseudorabies virus (PRV) genome, and especially of the UL31 gene, over the course of its evolution. Objectives: In the present study, the codon usage bias between the PRV UL31 sequence and UL31-like sequences was identified. Materials and Methods: We used a comprehensive analysi...
Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models.
Highly expressed genes in many bacteria and small eukaryotes often have a strong compositional bias, in terms of codon usage. Two widely used numerical indices, the codon adaptation index (CAI) and the codon usage, use this bias to predict the expression level of genes. When these indices were first introduced, they were based on fairly simple assumptions about which genes are most highly expre...
Revisiting the CAI from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models
Highly expressed genes in many bacteria and small eukaryotes often have a strong compositional bias, in terms of codon usage. Two widely used numerical indices, the codon adaptation index (CAI) and the codon usage, use this bias to predict the expression level of genes. Both indices are based on fairly simple assumptions about which genes are most highly expressed, which were known when they we...
Codon adaptation index as a measure of dominating codon bias
We propose a simple algorithm to detect dominating synonymous codon usage bias in genomes. The algorithm is based on a precise mathematical formulation of the problem that leads us to use the Codon Adaptation Index (CAI) as a 'universal' measure of codon bias. This measure has been previously employed in the specific context of translational bias. With the set of coding sequences as a...